Data summary
The sample consists of 45,000 observations of 14 variables:
- 5 categorical
- 9 numeric
Categorical variables
| Type | Variable | Missing | Complete rate | Min length | Max length | Empty | Unique | Whitespace |
|---|---|---|---|---|---|---|---|---|
| character | person_gender | 0 | 1 | 4 | 6 | 0 | 2 | 0 |
| character | person_education | 0 | 1 | 6 | 11 | 0 | 5 | 0 |
| character | person_home_ownership | 0 | 1 | 3 | 8 | 0 | 4 | 0 |
| character | loan_intent | 0 | 1 | 7 | 17 | 0 | 6 | 0 |
| character | previous_loan_defaults_on_file | 0 | 1 | 2 | 3 | 0 | 2 | 0 |
Numeric variables
| Variable | N | Complete (%) | Mean | SD | Min | P25 | Median | P75 | Max | IQR |
|---|---|---|---|---|---|---|---|---|---|---|
| cb_person_cred_hist_length | 45000 | 100 | 5.867489 | 3.879702 | 2.00 | 3.00 | 4.00 | 8.00 | 30.00 | 5.00 |
| credit_score | 45000 | 100 | 632.6088 | 50.43586 | 390.00 | 601.00 | 640.00 | 670.00 | 850.00 | 69.00 |
| label | 45000 | 100 | 0.2222222 | 0.4157443 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 |
| loan_amnt | 45000 | 100 | 9583.158 | 6314.887 | 500.00 | 5000.00 | 8000.00 | 12237.50 | 35000.00 | 7237.25 |
| loan_int_rate | 45000 | 100 | 11.00661 | 2.978808 | 5.42 | 8.59 | 11.01 | 12.99 | 20.00 | 4.40 |
| loan_percent_income | 45000 | 100 | 0.1397249 | 0.0872123 | 0.00 | 0.07 | 0.12 | 0.19 | 0.66 | 0.12 |
| person_age | 45000 | 100 | 27.76418 | 6.045108 | 20.00 | 24.00 | 26.00 | 30.00 | 144.00 | 6.00 |
| person_emp_exp | 45000 | 100 | 5.410333 | 6.063532 | 0.00 | 1.00 | 4.00 | 8.00 | 125.00 | 7.00 |
| person_income | 45000 | 100 | 80319.05 | 80422.50 | 8000.00 | 47202.00 | 67048.00 | 95789.50 | 7200766.00 | 48585.25 |
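One way to read the IQR column is as a quick screen for extreme values. The sketch below applies Tukey's 1.5 × IQR rule to the P75, IQR, and maximum values tabulated above for `person_age` and `person_emp_exp`. The rule itself and the helper name `upper_fence` are assumptions of this example, not something the report applies.

```python
# Quartile figures copied from the numeric summary table: (P75, IQR, max).
SUMMARY = {
    "person_age": (30.0, 6.0, 144.0),
    "person_emp_exp": (8.0, 7.0, 125.0),
}


def upper_fence(p75, iqr, k=1.5):
    """Tukey's upper fence: values above it are conventionally flagged as outliers."""
    return p75 + k * iqr


for name, (p75, iqr, maximum) in SUMMARY.items():
    fence = upper_fence(p75, iqr)
    print(f"{name}: upper fence = {fence}, max = {maximum}, beyond fence: {maximum > fence}")
```

Both maxima (144 for age, 125 for years of experience) lie far beyond their fences, so those records may deserve a closer look before modeling.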
The dataset includes the variable “loan_status”, which serves as our target variable (label).
Variable frequencies
Frequency of “loan_status”
Insights:
The most frequent class of loan_status accounts for 77.8% of all observations, which indicates an imbalanced but acceptable dataset.
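The 77.8% figure can be recovered from the summary statistics: because `label` is coded 0/1, its mean (0.2222222) is the share of observations with label = 1, and the remainder is the majority class. A minimal sketch of that arithmetic, with `class_shares` being an illustrative helper name only:

```python
def class_shares(label_mean):
    """Return (share of label = 1, share of label = 0) for a 0/1 label given its mean."""
    positive = label_mean
    negative = 1.0 - label_mean
    return positive, negative


# Mean of label as reported in the numeric summary table.
minority, majority = class_shares(0.2222222)
ratio = majority / minority  # majority-to-minority ratio
print(f"majority share: {majority:.1%}, ratio of about {ratio:.1f} to 1")
```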
Frequency of “person_gender”
Frequency of “person_education”
Frequency of “person_home_ownership”
Frequency of “loan_intent”
Frequency of “previous_loan_defaults_on_file”
Distribution of numeric variables
Distribution of person_age
Distribution of the client's work experience
Distribution of the client's income
Distribution of the client's loan amount
Distribution of numeric variables by label
Distribution of person_age
Distribution of person_income
Distribution of person_emp_exp
Distribution of loan_amnt
Distribution of loan_int_rate
Distribution of loan_percent_income
Distribution of cb_person_cred_hist_length
Distribution of credit_score
Scatter plot of credit score against client age
Statistical tests
Variable: client education
| Variable | Level | Rate (label = 1) | CI lower | CI upper | n | Count (label = 1) | p-value |
|---|---|---|---|---|---|---|---|
| person_education | Associate | 0.2203193 | 0.2129392 | 0.2278349 | 12028 | 2650 | 0.7328516 |
| person_education | Bachelor | 0.2252407 | 0.2181904 | 0.2324105 | 13399 | 3018 | 0.7328516 |
| person_education | Doctorate | 0.2286634 | 0.1961828 | 0.2637479 | 621 | 142 | 0.7328516 |
| person_education | High School | 0.2231039 | 0.2156725 | 0.2306701 | 11972 | 2671 | 0.7328516 |
| person_education | Master | 0.2176218 | 0.2079895 | 0.2274904 | 6980 | 1519 | 0.7328516 |
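The report does not name the test behind the p-value shown for person_education. As a plausibility check, the sketch below runs a Pearson chi-square test of independence on the per-level counts (group size and number of observations with label = 1). Choosing this test is an assumption, and the closed-form tail probability used here is valid only for 4 degrees of freedom, which is what a table with five levels and two outcomes gives.

```python
import math

# (n, count with label = 1) per education level, copied from the report.
COUNTS = {
    "Associate": (12028, 2650),
    "Bachelor": (13399, 3018),
    "Doctorate": (621, 142),
    "High School": (11972, 2671),
    "Master": (6980, 1519),
}


def chi_square_statistic(counts):
    """Pearson chi-square statistic for a (levels x 2) contingency table."""
    total_n = sum(n for n, _ in counts.values())
    total_pos = sum(pos for _, pos in counts.values())
    pooled = total_pos / total_n  # pooled rate of label = 1
    stat = 0.0
    for n, pos in counts.values():
        # Two cells per level: label = 1 and label = 0.
        for observed, expected in ((pos, n * pooled), (n - pos, n * (1 - pooled))):
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_sf_df4(x):
    """Upper-tail probability of a chi-square variable with exactly 4 degrees of freedom."""
    return math.exp(-x / 2) * (1 + x / 2)


stat = chi_square_statistic(COUNTS)
p_value = chi_square_sf_df4(stat)
print(f"chi-square = {stat:.3f}, p-value = {p_value:.4f}")
```

On these counts the p-value comes out at roughly 0.733, in line with the reported 0.7328516. That is consistent with a chi-square test having been used, though it does not prove it. Under either reading, the rate of label = 1 does not differ significantly across education levels.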
Variable: home ownership type
| Variable | Level | Rate (label = 1) | CI lower | CI upper | n | Count (label = 1) | p-value |
|---|---|---|---|---|---|---|---|
| person_home_ownership | MORTGAGE | 0.1159608 | 0.1113793 | 0.1206635 | 18489 | 2144 | 0 |
| person_home_ownership | OTHER | 0.3333333 | 0.2488994 | 0.4264342 | 117 | 39 | 0 |
| person_home_ownership | OWN | 0.0752287 | 0.0659677 | 0.0853421 | 2951 | 222 | 0 |
| person_home_ownership | RENT | 0.3239773 | 0.3179874 | 0.3300109 | 23443 | 7595 | 0 |
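The width of each interval for person_home_ownership depends heavily on group size. The report does not say how its intervals were computed, so the sketch below uses a 95% Wilson score interval as an assumed stand-in. Its bounds should land near the reported ones without matching them exactly.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z = 1.959964 gives about 95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# (n, count with label = 1) copied from the report.
GROUPS = {"OTHER": (117, 39), "RENT": (23443, 7595)}

for level, (n, positives) in GROUPS.items():
    low, high = wilson_interval(positives, n)
    print(f"{level}: rate = {positives / n:.4f}, interval = ({low:.4f}, {high:.4f})")
```

OTHER and RENT have similar rates, but with only 117 observations the OTHER interval is more than ten times wider than the RENT interval, which rests on 23,443 observations.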
Variable: loan intent
| Variable | Level | Rate (label = 1) | CI lower | CI upper | n | Count (label = 1) | p-value |
|---|---|---|---|---|---|---|---|
| loan_intent | DEBTCONSOLIDATION | 0.3027292 | 0.2920884 | 0.3135312 | 7145 | 2163 | 0 |
| loan_intent | EDUCATION | 0.1695619 | 0.1619258 | 0.1774088 | 9153 | 1552 | 0 |
| loan_intent | HOMEIMPROVEMENT | 0.2630148 | 0.2505808 | 0.2757388 | 4783 | 1258 | 0 |
| loan_intent | MEDICAL | 0.2781937 | 0.2687126 | 0.2878262 | 8548 | 2378 | 0 |
| loan_intent | PERSONAL | 0.2014036 | 0.1924088 | 0.2106294 | 7552 | 1521 | 0 |
| loan_intent | VENTURE | 0.1442640 | 0.1365458 | 0.1522484 | 7819 | 1128 | 0 |
Variable: loan amount
| Quartile bin | Variable | n | Mean | Rate (label = 1) | Share of sample (%) | p-value |
|---|---|---|---|---|---|---|
| 1 | loan_amnt | 13131 | 3477.991 | 0.2068388 | 29.18000 | 0 |
| 2 | loan_amnt | 10100 | 6756.866 | 0.1691089 | 22.44444 | 0 |
| 3 | loan_amnt | 10519 | 10342.012 | 0.2044871 | 23.37556 | 0 |
| 4 | loan_amnt | 11250 | 18536.944 | 0.3044444 | 25.00000 | 0 |
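The per-bin tables for numeric variables appear to come from splitting each variable into four quantile-based bins and summarizing `label` within each one. The sketch below reproduces that layout on a tiny made-up sample. The eight loan amounts and labels are hypothetical, as is the `quartile_table` helper; this is not the code used for the report.

```python
from bisect import bisect_left
from statistics import mean, quantiles


def quartile_table(values, labels):
    """Per-quartile-bin n, mean, rate of label = 1, and share of sample (%)."""
    # Three cut points (Q1, median, Q3) computed from the observed values.
    cuts = quantiles(values, n=4, method="inclusive")
    bins = {b: [] for b in (1, 2, 3, 4)}
    for value, label in zip(values, labels):
        # A value equal to a cut point goes to the lower bin.
        bins[bisect_left(cuts, value) + 1].append((value, label))
    rows = []
    for b, members in bins.items():
        if not members:
            continue  # heavy ties can leave a bin empty
        rows.append({
            "bin": b,
            "n": len(members),
            "mean": mean(v for v, _ in members),
            "rate": mean(y for _, y in members),
            "share_pct": 100 * len(members) / len(values),
        })
    return rows


# Hypothetical toy data, for illustration only.
amounts = [500, 1000, 2000, 3000, 5000, 8000, 12000, 35000]
labels = [0, 0, 1, 0, 0, 1, 1, 1]
for row in quartile_table(amounts, labels):
    print(row)
```

In the report the bins are not exactly equal in size (for example 13,131 versus 10,100 observations for `loan_amnt`), which would be consistent with many observations tied at the cut points.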
Variable: interest rate
| Quartile bin | Variable | n | Mean | Rate (label = 1) | Share of sample (%) | p-value |
|---|---|---|---|---|---|---|
| 1 | loan_int_rate | 11362 | 7.15566 | 0.0927654 | 25.24889 | 0 |
| 2 | loan_int_rate | 13075 | 10.29691 | 0.1665774 | 29.05556 | 0 |
| 3 | loan_int_rate | 9316 | 11.98020 | 0.1954702 | 20.70222 | 0 |
| 4 | loan_int_rate | 11247 | 14.91554 | 0.4398506 | 24.99333 | 0 |
Variable: loan amount as a percentage of income
| Quartile bin | Variable | n | Mean | Rate (label = 1) | Share of sample (%) | p-value |
|---|---|---|---|---|---|---|
| 1 | loan_percent_income | 11557 | 0.0481994 | 0.1121398 | 25.68222 | 0 |
| 2 | loan_percent_income | 11683 | 0.0992374 | 0.1317299 | 25.96222 | 0 |
| 3 | loan_percent_income | 11354 | 0.1568249 | 0.1887441 | 25.23111 | 0 |
| 4 | loan_percent_income | 10406 | 0.2681722 | 0.4826062 | 23.12444 | 0 |
Variable: credit score
| Quartile bin | Variable | n | Mean | Rate (label = 1) | Share of sample (%) | p-value |
|---|---|---|---|---|---|---|
| 1 | credit_score | 11265 | 563.3879 | 0.2248557 | 25.03333 | 0.2935483 |
| 2 | credit_score | 11566 | 622.4738 | 0.2257479 | 25.70222 | 0.2935483 |
| 3 | credit_score | 11235 | 655.5263 | 0.2219849 | 24.96667 | 0.2935483 |
| 4 | credit_score | 10934 | 691.0974 | 0.2160234 | 24.29778 | 0.2935483 |
Variable: client age
| Quartile bin | Variable | n | Mean | Rate (label = 1) | Share of sample (%) | p-value |
|---|---|---|---|---|---|---|
| 1 | person_age | 15934 | 22.89162 | 0.2355968 | 35.40889 | 1.6e-06 |
| 2 | person_age | 8166 | 25.44808 | 0.2217732 | 18.14667 | 1.6e-06 |
| 3 | person_age | 10299 | 28.33032 | 0.2151665 | 22.88667 | 1.6e-06 |
| 4 | person_age | 10601 | 36.32205 | 0.2093199 | 23.55778 | 1.6e-06 |
Variable: years of work experience
| Quartile bin | Variable | n | Mean | Rate (label = 1) | Share of sample (%) | p-value |
|---|---|---|---|---|---|---|
| 1 | person_emp_exp | 13627 | 0.2980113 | 0.2383503 | 30.28222 | 1e-07 |
| 2 | person_emp_exp | 11548 | 2.9471770 | 0.2192587 | 25.66222 | 1e-07 |
| 3 | person_emp_exp | 9811 | 6.3041484 | 0.2199572 | 21.80222 | 1e-07 |
| 4 | person_emp_exp | 10014 | 14.3319353 | 0.2059117 | 22.25333 | 1e-07 |
Variable: years of credit history
| Quartile bin | Variable | n | Mean | Rate (label = 1) | Share of sample (%) | p-value |
|---|---|---|---|---|---|---|
| 1 | cb_person_cred_hist_length | 14849 | 2.559768 | 0.2319348 | 32.99778 | 0.0002366 |
| 2 | cb_person_cred_hist_length | 8653 | 4.000000 | 0.2255865 | 19.22889 | 0.0002366 |
| 3 | cb_person_cred_hist_length | 11737 | 6.460680 | 0.2182841 | 26.08222 | 0.0002366 |
| 4 | cb_person_cred_hist_length | 9761 | 11.841615 | 0.2091999 | 21.69111 | 0.0002366 |
Variable: client income
| Quartile bin | Variable | n | Mean | Rate (label = 1) | Share of sample (%) | p-value |
|---|---|---|---|---|---|---|
| 1 | person_income | 11250 | 35268.65 | 0.4037333 | 25.00000 | 0 |
| 2 | person_income | 11251 | 57158.40 | 0.2205137 | 25.00222 | 0 |
| 3 | person_income | 11249 | 80046.01 | 0.1721042 | 24.99778 | 0 |
| 4 | person_income | 11250 | 148805.19 | 0.0925333 | 25.00000 | 0 |